Appendix for Text-Image Topic Discovery for Web News Data
Authors
Abstract
In this appendix, we give the detailed derivation, proof of convergence, complexity analysis, parameter analysis, and convergence curves for the proposed method.
I. FORMULATION
We formalize joint text-image topic discovery as the following optimization problem: $\min_{F \ge 0,\, G \ge 0} \|X - G F^{T}\|_{2,1} + \frac{\lambda}{2} \sum^{M}$ ...
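As a rough illustration of the first term of the objective above, the sketch below evaluates the reconstruction error $\|X - G F^{T}\|_{2,1}$ with NumPy. It assumes the column-wise convention for the $\ell_{2,1}$ norm (the sum of the Euclidean norms of the columns), which is one common definition; the appendix excerpt does not state which convention it uses. The function names and matrix sizes are hypothetical, and the regularization term is omitted because it is truncated in the excerpt.

```python
import numpy as np


def l21_norm(R):
    """Sum of the Euclidean norms of the columns of R.

    Assumption: column-wise l2,1 convention; some papers sum over rows instead.
    """
    return float(np.sum(np.linalg.norm(R, axis=0)))


def reconstruction_error(X, G, F):
    """First term of the objective: ||X - G F^T||_{2,1}."""
    return l21_norm(X - G @ F.T)


# Hypothetical sizes: 6 features, 4 documents, 2 topics.
rng = np.random.default_rng(0)
G = rng.random((6, 2))  # nonnegative basis matrix
F = rng.random((4, 2))  # nonnegative coefficient matrix
X = G @ F.T             # data that the factors reproduce exactly

print(reconstruction_error(X, G, F))
print(reconstruction_error(X + 0.1, G, F))
```

Because each column contributes its unsquared norm, a single outlying column affects this error less than it would under the squared Frobenius norm, which is the usual motivation for this kind of loss.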
Similar resources
Text-Image Topic Discovery for Web News Data
We formally propose a new application problem: unsupervised text-image topic discovery. This problem is important because almost all news articles have an associated picture. Unlike traditional topic modeling, which considers text alone, the new task aims to discover heterogeneous topics from web news of multiple data types. Heterogeneous topic discovery is challenging because di...
A New Document Embedding Method for News Classification
Text classification is one of the main tasks of natural language processing (NLP). In this task, documents are classified into predefined categories. A large amount of news spreads on the web. A text classifier can categorize news automatically, which facilitates and accelerates access to the news. The first step in text classification is to represent documents in a suitable way t...
Arabic News Articles Classification Using Vectorized-Cosine Based on Seed Documents
Besides its own merits, text classification (TC) has become a cornerstone of many applications. The work presented here is part of, and a prerequisite for, a project we have undertaken to create a corpus for Arabic text processing. It is an attempt to automatically create modules that would help speed up classification for any text categorization task. It also serves as a tool for...
Information Discovery based on Multi-granularity Text Fusion
In this paper we introduce a new information discovery algorithm for the Web, Multi-granularity Text Fusion (MGTF). Granularity means the length of news-relevant web documents, such as news web pages, blogs, and microblogs, which come from web users. The longer the text, the higher its granularity. Given a topic query on the Internet and the results of different granularity and time-s...
Discovering and Tracking Events From News, Blogs and Microblogs on the Web
Using three data sources (news, blogs, and microblogs), this study proposes a framework for discovering and tracking events embedded in free-form online text. Existing text mining methods are discussed for the three sources. Because the three sources have different perspectives, event analysis, a region-topic model, and rare keywords are proposed for them, respectively. In order to integrate three data source...
Journal title:
Volume, issue
Pages -
Publication date: 2013